-
Notifications
You must be signed in to change notification settings - Fork 116
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Major changes to interpolations and resolvers #445
Conversation
This pull request introduces 1 alert when merging ddb57bb into 800e8a7 - view on LGTM.com new alerts:
|
ddb57bb
to
ee83314
Compare
Just a note that I fixed that in a force push (it was making flake8 tests fail as well) -- this was caused by my rebase on top of master |
looks like you have conflicts with master. can you fix? |
ee83314
to
ad826e7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, made a review pass.
lots of comments and questions inline. overall looks good.
(Must say this was easier to review now, in part because of the previous review passes in the other PR but also due to the fact that reviewing it as one unit is easier.
Environment variables are parsed when they are recognized as valid quantities that | ||
may be evaluated (e.g., int, float, dict, list): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you go over why we think (sometimes) parsing the environment variables as opposed to always passing them as is in a string is the the best choice?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Until yesterday I thought that otherwise there would no way to write something like:
learning_rate: ${env:LR}
and have learning_rate
be a float.
That being said, I just realized that {Integer,Float,Boolean}Node
actually convert from string, so maybe this would work with structured configs?
(although I'm not entirely sure why this is allowed in the first place, my first instinct would be that assigning a string to an int/float/bool node should raise an exception)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, you can check and see for yourself :).
I think the answer is no, because the result if the interpolation is returned directly and is not assigned to the config.
Converting on assignment was introduced really early and may have been a mistake.
The motivation is to support converting during merge: in particular merging from cli, where everything is a string.
We should consider deprecating support for assignments that mypy would frowns upon if it knew the type.
@dataclass
class Foo:
x: int = 10
cfg: Foo = OmegaConf.structured(Foo)
cfg.x = "20" # mypy would bitch here even though it's legal in OmegaConf.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should consider deprecating support for assignments that mypy would frowns upon if it knew the type.
I created #459 to keep track of this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Until yesterday I thought that otherwise there would no way to write something like:
learning_rate: ${env:LR}and have
learning_rate
be a float.
Issue 488 is a recent report about this.
I think we should consider validating the return value of custom interpolations against the specified type.
This is a breaking change, so switching to new resolvers is a good opportunity (with a potential opt-out flag - maybe).
As a POC, I gave it a shot in #506 (based on the current interpolation logic). It adds validation for nodes that are annotated (ValueNodes only, and only if they are not AnyNode).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a note that this actually doesn't work, I'll look into fixing it. Just to be sure, do we want the following to work:
@dataclass
class Cfg:
s: str = "7"
i: int = II("s")
?
(the question is, since s
can be cast to int, is it ok to do it implicitly here as well?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed, I reverted the changes that were (attempting at) addressing #488. I will do that in a follow-up PR instead, to avoid adding too much stuff here (since it's already way too big).
As a result, I also believe that it's best to also revisit the behavior of the env
resolver in another follow-up PR, so that when env
is changed to only return strings, then at least typed nodes keep working as expected (even without a new decode
resolver).
@omry if you're ok with that plan then this conversation may be resolved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
resolving.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to clarify:
There are multiple things here that should probably be changed.
- Support for type annotation for resolvers (with runtime validation).
- Env resolver should no longer parse input environment variables.
- Potentially new built in resolver oc.decode() that can parse an input with the grammar parser.
488 is about 1.
2 and 3 are related.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, they will be addressed in that order in future PRs
tests/test_interpolation.py
Outdated
# same as the definition (or `OmegaConf.create()` called on it for lists | ||
# and dictionaries). | ||
# Order matters! (each entry should only depend on those above) | ||
TEST_CONFIG_DATA: List[Tuple[str, Any, Any]] = [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was expecting to see some parser specific unit tests (testing the parser directly without going through the higher layers).
Did I miss it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I didn't realize it was possible until later, at which point I had already written all tests this way.
Do you want me to refactor them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I missed this one.
Yes - I think having lower level tests will make the tests more robust and easier to work with.
Once you have lower level tests, this entire function can probably be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I refactored these tests in b520815
Essentially I split the big list into 3:
- Stuff that can be parsed without interpolations (to test the basic constructs of the grammar, like int, float, str, dict, etc.). Those can be parsed without using any config object.
- Expressions containing interpolations for the parser rule
singleElement
(this is the biggest chunk) - Finally, expressions for the highest-level parser rule
configValue
In the end it still looks pretty similar overall, but at least it's faster and avoids using dummy resolvers everywhere just to test the correct parser rule. Let me know if that looks ok to you.
NB: in a first attempt I tried to go even lower in the rules (e.g., using the dictContainer
parser rule for dicts), but the problem is that without an EOF
in the parser rule, ANTLR fails to detect some errors (ex: {0: 1}}
is not supposed to be a valid dict). That's why I stuck to the singleElement
and configValue
rules (that contain the EOF
).
3ada218
to
a8ff6f9
Compare
Regarding #448, currently it doesn't work because
|
This is an extension of group@pkg, allowing an empty package in Hydra. group and group@pkg are now going to be supported as interpolation keys in the defaults list (this is non-standard interpolation support). to enable that OmegaConf need to accept @ as a legal character in config keys.
|
Ok thanks, I added this in 3af2f85 (edited: initially it was d959a13 but I realized I could make the grammar more readable) (also added another commit, af3870d, to add a missing test for Regarding Hydra, I tried synching facebookresearch/hydra#1170 with a config.yaml looking like: foo:
x@y: youpla and run the command line python my_app.py foo.x@y=boom and Hydra complains with |
d959a13
to
3af2f85
Compare
Commented in one of the commits, this seems a bit more complicated than I would have wanted. About Hydra: #1170 is still work in progress, the new logic is not integrated there yet. Here is a concrete test that has an interpolation with |
Link to the discussion for reference: d959a13#r44887061
Ok, just wanted to be sure!
I see, thanks! |
200fef2
to
9e1bc41
Compare
I did a force push to replace my old commit adding support for One open question is whether we really want to allow more varied dictionary keys: see discussion at 200fef2#commitcomment-44939891 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Recent batch of changes looks good. I still need to have a second pass on your responses to everything else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Partial pass. Got about a million lines left to review :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got most of it, but ran out of energy for now.
Environment variables are parsed when they are recognized as valid quantities that | ||
may be evaluated (e.g., int, float, dict, list): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Until yesterday I thought that otherwise there would no way to write something like:
learning_rate: ${env:LR}and have
learning_rate
be a float.
Issue 488 is a recent report about this.
I think we should consider validating the return value of custom interpolations against the specified type.
This is a breaking change, so switching to new resolvers is a good opportunity (with a potential opt-out flag - maybe).
As a POC, I gave it a shot in #506 (based on the current interpolation logic). It adds validation for nodes that are annotated (ValueNodes only, and only if they are not AnyNode).
This is also closing #266, right? |
This reverts commit 51c201f.
Will be re-enabled once the behavior of `register_new_resolver()` is finalized. Also renamed some variables for clarity and consistency.
5c03808
to
5cb75f6
Compare
Your fixed string:
Only closed 100. |
Oops, thanks for taking care of it! Looks like we actually need to repeat the "Fixes / Closes" keyword (https://stackoverflow.com/questions/3547445/closing-multiple-issues-in-github-with-a-commit-message) |
This is a squashed version of #321.
env
resolver now parses environment variables in a way that is consistent with the new interpolation grammarFixes #100 #230 #266 #318